Skip to content

feat: add DecodeFieldError/ReadFileError wrappers for error context#6128

Open
koriyoshi2041 wants to merge 2 commits into
lance-format:mainfrom
koriyoshi2041:fix/error-context-wrapping
Open

feat: add DecodeFieldError/ReadFileError wrappers for error context#6128
koriyoshi2041 wants to merge 2 commits into
lance-format:mainfrom
koriyoshi2041:fix/error-context-wrapping

Conversation

@koriyoshi2041
Copy link
Copy Markdown

Summary

Adds two wrapper error types to provide better context when decoding errors occur, as described in #5602:

  • DecodeFieldError (lance-encoding/src/decoder.rs): wraps errors with field name and field id. Applied at field scheduler creation in both structural and legacy decode paths (Struct, List, FixedSizeList, Map).
  • ReadFileError (lance-file/src/reader.rs): wraps errors with file path. Applied at FileReader::read_tasks and read_stream_projected_blocking entry points. FileReader now stores the file path.

Both implement std::error::Error with proper source chaining and From<...> for lance_core::Error, producing error chains like:

Error: failed to read file 'data.lance'

Caused by:
    0: failed to decode field 'age' (id=10)
    1: number out of range

Design follows the approach from https://sabrinajewson.org/blog/errors as referenced in the issue.

Test plan

  • Unit tests for DecodeFieldError display, source chain, and conversion to lance_core::Error (3 tests)
  • Unit tests for ReadFileError display, source chain with nested DecodeFieldError, and conversion to lance_core::Error (3 tests)
  • cargo fmt --check passes
  • cargo clippy --all-targets -p lance-encoding -p lance-file passes
  • All existing tests pass

Closes #5602

@github-actions github-actions Bot added the enhancement New feature or request label Mar 8, 2026
@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented Mar 8, 2026

PR Review

Good approach following the error wrapping pattern from the linked blog post. The DecodeFieldError / ReadFileError types with proper source() chaining will meaningfully improve debuggability.

Issues

P1 — Incomplete wrapping in legacy decode path

The structural path wraps child field errors for Struct, List, FixedSizeList, and Map. But in the legacy path, only Struct children (line ~990) get wrapped. The legacy create_list_scheduler (line 634) calls create_legacy_field_scheduler for the list's child without wrapping:

let items_scheduler =
    self.create_legacy_field_scheduler(&list_field.children[0], column_infos, buffers)?;

This should also wrap with DecodeFieldError for consistency. Same for any other legacy compound types (FixedSizeList, etc.) that recurse into children.

P1 — io_source loses original error category

Both From<DecodeFieldError> and From<ReadFileError> convert via Error::io_source(). This means if the original error was e.g. Error::invalid_input(...), the wrapped error becomes an IO variant. This changes the error category, which could break callers that match on error variants for control flow. Consider preserving the original error's category or using a category-agnostic wrapping mechanism.

Nit

The #[cfg(test)] block had a comment // test coalesce indices to ranges removed — this looks like an accidental drive-by deletion (the comment described the test below it).

@koriyoshi2041
Copy link
Copy Markdown
Author

Thanks for the thorough review! All three issues have been addressed in the latest commit:

  1. P1 — Legacy decode path: Added DecodeFieldError wrapping in create_list_scheduler for child field errors, matching the structural path's behavior.

  2. P1 — Error category preservation: Changed From<DecodeFieldError> and From<ReadFileError> to use Self::wrapped() instead of Self::io_source(), so the original error category is preserved in the chain.

  3. Nit — Restored comment: Put back the // test coalesce indices to ranges comment above mod tests.

All 428 tests pass (362 lance-encoding + 66 lance-file).

…r context

Add wrapper error types that provide contextual information when decoding
errors occur:

- DecodeFieldError: wraps errors with field name and field id, applied at
  field scheduler creation points in both structural and legacy decode paths
- ReadFileError: wraps errors with file path, applied at FileReader's
  read_tasks and read_stream_projected_blocking entry points

This produces error chains like:
  failed to read file 'data.lance'
    Caused by:
      0: failed to decode field 'age' (id=10)
      1: number out of range

Closes lance-format#5602
- Wrap child field errors with DecodeFieldError in legacy list decode
  path (create_list_scheduler), matching the structural path's coverage
- Use Error::wrapped() instead of Error::io_source() in both
  From<DecodeFieldError> and From<ReadFileError> to preserve the
  original error's category instead of converting everything to IO
- Restore accidentally deleted "test coalesce indices to ranges" comment
@koriyoshi2041 koriyoshi2041 force-pushed the fix/error-context-wrapping branch from f0d2e67 to 65f170d Compare May 29, 2026 17:17
@koriyoshi2041
Copy link
Copy Markdown
Author

Rebased onto current main (the branch had drifted behind). The only conflicts were in the decoder.rs / reader.rs test modules where main had since added neighbouring tests — kept both sets. The production changes (the DecodeFieldError / ReadFileError wrappers and the map_err sites) applied cleanly.

Re-verified locally: cargo test -p lance-encoding -p lance-file (the 6 wrapper tests pass), cargo clippy -p lance-encoding -p lance-file --tests, and cargo fmt are all clean. Ready for another look when you have a moment.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Add ReadFileError and DecodeColumnError wrappers to provide better context for decoding errors

1 participant